Large-scale, complex computing systems such as the Internet, banking systems, online payment systems, and security and surveillance systems generate large amounts of data every day. A considerable share of these data is imbalanced. Imbalanced data can mislead machine learning models and data mining techniques. Learning from imbalanced data is a relatively new challenge that has attracted growing attention worldwide. The difficulty arises in learning problems whose classes are unevenly distributed. This paper concentrates on a few realistic and appropriate data preprocessing techniques and presents an appropriate class evaluation process for imbalanced data. Several well-known soft computing methods, namely Support Vector Machine (SVM), Decision Tree Classifier (DTC), K-Nearest Neighbor (KNN), and Gaussian Naïve Bayes (GNB), are compared empirically in terms of Accuracy, Precision, Recall, and F-measure on an imbalanced dataset. The classifiers were trained after resampling the imbalanced data with a well-known over-sampling technique, the Synthetic Minority Over-sampling Technique (SMOTE), with under-sampling using the Cluster Centroids (CC) technique, and with a hybrid technique, SMOTEENN, which combines SMOTE and Edited Nearest Neighbor (ENN). Accuracy, Precision, Recall, F-measure, and the confusion matrix are used to evaluate performance. The work thus presents an experimental comparison of several well-known classification algorithms and of performance measures that are appropriate for imbalanced datasets. The results show that the hybrid method performs better than the over-sampling and under-sampling techniques.
International Conference on Cyber Security and Computer Science
ICONCS
Md. Anwar Hossen
Fatema Siddika
Tonmoy Kumar Chanda
T. Bhuiyan
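The evaluation measures named in the abstract (Accuracy, Precision, Recall, F-measure, and the confusion matrix) can be illustrated with a minimal sketch. It is not the authors' code: the function names `confusion_counts` and `scores` are hypothetical, and the toy labels (95 majority samples, 5 minority samples) are invented purely for illustration and are not taken from the paper's dataset. The sketch only shows why accuracy alone may be misleading when classes are unevenly distributed.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem; `positive` is the minority label."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F-measure from the confusion counts.

    Undefined ratios (zero denominator) are reported as 0.0, a common convention.
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy labels (an assumption for illustration): 95 majority (0), 5 minority (1).
y_true = [0] * 95 + [1] * 5

# A trivial classifier that always predicts the majority class.
y_majority = [0] * 100
majority_scores = scores(y_true, y_majority)

# A classifier that finds 4 of the 5 minority samples at the cost of 6 false alarms.
y_detect = [1] * 6 + [0] * 89 + [1] * 4 + [0] * 1
detect_scores = scores(y_true, y_detect)

print(majority_scores)
print(detect_scores)
```

On these toy labels the always-majority classifier reaches a higher accuracy than the second classifier while its recall and F-measure on the minority class are zero, which is the kind of effect that motivates reporting all four measures together.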